AD-A134 737  STATISTICAL MODELING OF RAILROAD SAFETY PERFORMANCE(U)  1/1
ARMY ARMAMENT RESEARCH AND DEVELOPMENT CENTER ABERDEEN
PROVIN..  B A BODT ET AL.  SEP 83  ARBRL-MR-03311
UNCLASSIFIED  SBI-AD-F300 342  DTFR53-82-X-00276  F/G 13/6  NL


Destroy  this  report  when  it  is  no  longer  needed. 
Do  not  return  it  to  the  originator. 


Additional copies of this report may be obtained
from the National Technical Information Service,
U. S. Department of Commerce, Springfield, Virginia
22161.


The  findings  in  this  report  are  not  to  be  construed  as 
an  official  Department  of  the  Army  position,  unless 
so  designated  by  other  authorized  documents. 


The use of trade names or manufacturers' names in this report
does not constitute indorsement of any commercial product.


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT  DOCUMENTATION  PAGE 


4. TITLE (and Subtitle)

STATISTICAL MODELING OF RAILROAD SAFETY PERFORMANCE

5. TYPE OF REPORT & PERIOD COVERED

6. PERFORMING ORG. REPORT NUMBER

7. AUTHOR(s)

Barry  A.  Bodt 

Jerry  Thomas 

8. CONTRACT OR GRANT NUMBER(s)

9. PERFORMING ORGANIZATION NAME AND ADDRESS

US Army Ballistic Research Laboratory

ATTN: DRSMC-BLB(A)

Aberdeen Proving Ground, MD 21005

10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS

RDT&E DTFR-53-82-X-00276

11. CONTROLLING OFFICE NAME AND ADDRESS

US Army Armament Research & Development Command

US Army Ballistic Research Laboratory (DRDAR-BLA-S)
Aberdeen Proving Ground, MD 21005

12. REPORT DATE

September 1983

13. NUMBER OF PAGES

36

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)

Unclassified

15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)


Approved  for  public  release;  distribution  unlimited. 


17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)


18. SUPPLEMENTARY NOTES


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Safety Program                Cluster Analysis

Railroad Safety               Discriminant Analysis

Safety Quantification         Statistical Analysis

Multivariate Analysis

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

In recent years the Department of Transportation and the Association of American
Railroads have become concerned over the safety performance of this nation's
railroads. Recent statistics showed that the frequency of accidents was alarmingly
high. Acting on this concern, the D.O.T. asked the Ballistic Research
Laboratory to determine how railroads could be made safer. One approach to this
problem would be to develop a mathematical model which would allow a railroad's
safety to be expressed as a function of safety program characteristics. In
this manner, guidelines for spending which could improve a railroad's safety
record would be established.

Several statistical techniques, including Cluster Analysis, Discriminant Analysis,
and Multiple Linear Regression, were employed in the development of this
mathematical model. The model is a good predictor for the years 1976-1979,
and should assist railroads in developing guidelines for spending which should
improve their safety performance.




TABLE  OF  CONTENTS 


INTRODUCTION

AVAILABLE DATA

DATA ANALYSIS

A SAFETY MODEL

RESULTS

CONCLUSION

ACKNOWLEDGEMENTS

BIBLIOGRAPHY

APPENDIX A. DETAILED DATA ANALYSIS

    1. Procedure

        a. Phase 1 - Define Safety

        b. Phase 2 - Model Safety = f(X)

        c. Phase 3 - Model Validation

    2. Results

APPENDIX B. STATISTICAL TECHNIQUES

    1. Cluster Analysis

    2. Discriminant Analysis

APPENDIX C. DATA

    1. Data Sets

    2. Safety Performance Indicators

    3. Safety Program Indicators

    4. Subject Categories in the Safety Director Survey

    5. Additional Variables of Interest


DISTRIBUTION  LIST 




I. INTRODUCTION


In recent years, the Department of Transportation (D.O.T.) and the Association
of American Railroads have become concerned over the safety performance of this
nation's railroads. Recent statistics showed that the frequency of accidents was
alarmingly high. Acting on this concern, the D.O.T. asked the Ballistic Research
Laboratory to determine how railroads could be made safer. One approach to this problem
would be to develop a mathematical model which would allow a railroad's safety to be
expressed as a function of safety program characteristics. In this manner, guidelines
for spending which could improve a railroad's safety record would be established.


II. AVAILABLE DATA

Data available for the analysis pertained to safety performance records, safety
program size, and safety program content. Of these data, the first two are quantitative,
comprising 26 statistics compiled for 1979 by the Association of American Railroads.
For example, some possible safety descriptors are Accident Frequency and
Accident Severity. Some possible safety program indicators are Estimated Safety
Program Costs and the Number of Safety Representatives. The third data
category, safety program content, consisted of subjective variables whose values were
formed as follows. A survey addressing safety-related topics, conducted under D.O.T.
sponsorship, was given to the safety director of each of 15 railroads. Their responses
were assigned scores based on how well they matched the "ideal" response. These scores
became the values for this category of data. In addition to the 1979 data, some safety
performance and program information from the years 1976-1978 was made available for
this study. The data sources and variable definitions can be found in Appendix C.


III.  DATA  ANALYSIS 

The data were first checked for accuracy and completeness; obvious errors were
corrected, and variables with insufficient "good" data were eliminated. In performing
this task, one of the 15 railroads was eliminated due to insufficient data.

The next step in the data analysis was to classify the railroads into safety groups
according to their safety performance variables. In order to do this objectively, a
statistical technique called cluster analysis was used. This is a multivariate statistical
technique wherein railroads were separated into groups based on the minimization of
variance within a group and the maximization of the distance (variance) between groups.*
To be objective, the number of groups into which to separate the railroads was not
specified. Using cluster analysis on the 1979 data, the railroads were separated into
two groups based on three safety performance variables, namely, Injury Costs,
Accident Frequency, and Accident Severity. From an intuitive standpoint this grouping
was satisfactory. That is, one group could be called "safe" railroads and the other
"unsafe."

*See Appendix B.
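The grouping step described above can be sketched with a small two-group clustering routine. This is an illustration only: the report does not name the clustering algorithm it used, so the k-means style reassignment below is an assumption, the number of groups is fixed at two here for simplicity, and the six data rows (Injury Costs, Accident Frequency, Accident Severity) are hypothetical values, not the study data.

```python
import math

def standardize(rows):
    """Scale each column to mean 0 and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def two_means(points, iterations=50):
    """Split points into two groups by reassigning each to its nearest centroid."""
    # Deterministic start: the two points that are farthest apart.
    pairs = [(math.dist(p, q), i, j)
             for i, p in enumerate(points)
             for j, q in enumerate(points) if i < j]
    _, i, j = max(pairs)
    centroids = [list(points[i]), list(points[j])]
    labels = None
    for _ in range(iterations):
        new_labels = [min((0, 1), key=lambda k: math.dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Hypothetical rows: (Injury Costs, Accident Frequency, Accident Severity)
data = [
    [1.0, 4.0, 30.0], [1.2, 5.0, 35.0], [0.9, 4.5, 28.0],
    [3.0, 11.0, 90.0], [3.4, 12.0, 95.0], [2.8, 10.0, 85.0],
]
labels = two_means(standardize(data))
```

The variables are standardized first so that no single indicator dominates the Euclidean distance merely because of its units; whether the original analysis did this is not stated.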




The next step was to take these three clustering variables and combine them into
a single index that could be called Safety. To do this, another statistical technique,
discriminant analysis, was employed based on the results of the cluster analysis. The
discriminant index or score is a function of the individual safety variables whose value
increases monotonically with respect to each of the component variables, so that
higher scores indicate poorer safety performance.

Similar procedures were used on the variables associated with the safety program.
Thus, two groups were identified using two variables that objectively described a safety
program, namely, Safety Cost and Safety Staff. The table below shows the cross
classification of the safety groups and the safety program groups. This table is very
encouraging in that railroads with good programs tend to be "safe," and those with
poor programs tend to be "unsafe."*


TABLE 1. CROSS CLASSIFICATION

                        GOOD SAFETY     POOR SAFETY
                        PROGRAM         PROGRAM

GOOD SAFETY RECORD           4               1

POOR SAFETY RECORD           4               5
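As a quick check on how Table 1 is read, the counts can be tallied directly. The sketch below uses only the four counts printed in the table; the dictionary keys are labels chosen here for illustration.

```python
# Counts from Table 1: (safety record group, safety program group) -> railroads
table = {
    ("good record", "good program"): 4,
    ("good record", "poor program"): 1,
    ("poor record", "good program"): 4,
    ("poor record", "poor program"): 5,
}

total = sum(table.values())
# Railroads whose record group matches their program group
concordant = (table[("good record", "good program")]
              + table[("poor record", "poor program")])
discordant = total - concordant
agreement = concordant / total

# Share of poor-program railroads that also have a poor record
poor_program = (table[("good record", "poor program")]
                + table[("poor record", "poor program")])
poor_given_poor_program = table[("poor record", "poor program")] / poor_program
```

The tally gives 14 railroads in all, 9 in agreement and 5 not, which matches the count of counterintuitive cases discussed in Appendix A.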

IV.  A  SAFETY  MODEL 

Using multiple linear regression, a safety model was developed based on the
safety program variables. This model was deemed useful based on the 1979 data.
Therefore, it was time to validate the model using previous years' data. Due to
insufficient 1976-1978 data for Safety, an available safety variable (Accident Index)
closely related to the discriminant score, Safety, was substituted into the model and
corresponding model adjustments were made. For brevity, Accident Index will be
referred to as Safety. The validation process showed that the model was not useful for
predicting the 1976-1978 data. Investigation into the cause of poor prediction led to
the elimination of four of the railroads. These four railroads are different for some
unknown reason(s). The reason(s) could be bad data, poor reporting procedures, or
some lurking variable, that is, an unaccounted-for variable having an influence on
the response data. After the elimination of the four railroads, one of the three safety
program variables dropped out of the model. Further analysis revealed that separate
regression models using the remaining two safety program variables explained at least
72% of the variation of Safety for the 10 remaining railroads in each of the years
1976-1979. So that one model to explain Safety for each of the four years could be
developed, the data for those years were pooled for the multiple regression analysis.
The resulting model was able to explain 71% of the variation in Safety over the four
years. Figures 1-4 illustrate the predictive powers of the model for the 10 railroads
over each of the four years.*
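The pooling step can be sketched as an ordinary least-squares fit with one indicator column per year, so that each year receives its own intercept shift. This is a hedged illustration, not the authors' computation: the helper names, the choice of 1979 as the baseline year, and all data values below are assumptions, and the synthetic responses are generated without noise so the fit recovers the generating coefficients exactly.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(X, y):
    """Least-squares coefficients and R^2 via the normal equations."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

YEARS = [1976, 1977, 1978, 1979]
# Hypothetical generating values, chosen only for this illustration
BIAS = {1976: -0.3, 1977: -0.2, 1978: 0.1, 1979: 0.0}
base = [(0.8, 40.0), (1.0, 45.0), (1.2, 38.0)]  # (Efficiency, Equipment Load)

X, y = [], []
for t, year in enumerate(YEARS):
    for k, (eff, load) in enumerate(base):
        e = eff + 0.02 * t * (k + 1)
        ld = load + 1.5 * t * (k - 1)
        dummies = [1.0 if year == d else 0.0 for d in YEARS[:-1]]  # 1979 is baseline
        X.append([1.0, e, ld] + dummies)
        y.append(3.0 - 2.0 * e + 0.03 * ld + BIAS[year])

beta, r2 = ols(X, y)  # beta: intercept, efficiency, load, 1976, 1977, 1978 shifts
```

With real data the fitted year shifts play the role of a per-year bias term, and R^2 would fall below 1.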


*See Appendix A for a more detailed discussion.


[Figures 1-4. PREDICTED AND OBSERVED SAFETY, one plot per year for 1976 through 1979.
Horizontal axis: RAILROAD; vertical axis: SAFETY. Legend: observed Safety, predicted Safety.]


V.  RESULTS 


The mathematical model developed to predict railroad safety as a function of
safety program characteristics is given below. Keep in mind that small values for
Safety are preferred.

Safety = 3.342 - [illegible](Efficiency) + .033(Equipment Load) + 1.(Yearly Bias)

where Yearly Bias is equal to -.353, -.311, .058, .000 for 1976 through 1979,
respectively.

Safety is defined as the square root of (the number of reported injuries times the
number of days lost due to injury divided by 200) divided by (the number of man-hours
employed times .000005). Efficiency is defined as 1/1000 times the ratio of
revenue ton miles to man-hours employed. Equipment Load is defined as 1/1000
times the ratio of ton miles to train hours. According to the specified model, Safety
is improved by reducing the Equipment Load and increasing the Efficiency.
Efficiency was one of several profitability and efficiency indices considered for use at
the recommendation of Mr. Edward O. Baicy, BRL's liaison with D.O.T. It was felt
that a favorable overall efficiency rating would tend to indicate efficient safety program
management as well. How to increase Efficiency may or may not be obvious to a
railroad.

The following table indicates how much the 10 railroads could improve their
safety performance by increasing Efficiency and decreasing Equipment Load. The
improvement is given in terms of money saved due to fewer work-days lost due to
injury. The estimate is based on the improvement to be made by the average 1979
railroad and uses the functional relationship between Severity Rate and
Work-Days-Lost. The average Efficiency, average Equipment Load, and average
Man-Hours-Employed were used to characterize the average railroad. A work day was
taken to be worth $80; this was based on a yearly salary of $20,000 and 250 work-days.
The average railroad lost approximately $2,592,182 in 1979 due to Work-Days-Lost
caused by injury.

Note: cost of improvements should be subtracted in order to obtain
"true" savings.




TABLE 2. SAFETY IMPROVEMENT

Efficiency    Load        Safety    Improvement    Savings    % Savings
(% avr.)      (% avr.)                             ($K)

100%          100%        2.63
120%          100%        2.21      .42            $618       24%
100%           80%        2.35      .28            $413       16%
120%           80%        2.07      .56            $1000      39%
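The dollar arithmetic behind Table 2 can be checked in a few lines. The sketch below uses only figures quoted in the text and the table (salary, work-days, average 1979 loss, and the three savings amounts); the variable and key names are chosen here for illustration, and the step from Safety improvement to dollars saved is not reproduced because the report does not spell out that relationship.

```python
SALARY = 20_000            # yearly salary assumed in the report (dollars)
WORK_DAYS = 250            # work-days per year assumed in the report
AVERAGE_LOSS = 2_592_182   # 1979 loss of the average railroad (dollars)

day_value = SALARY / WORK_DAYS          # dollars per work-day
days_lost = AVERAGE_LOSS / day_value    # implied average work-days lost

# Savings column of Table 2, in thousands of dollars
savings_k = {"efficiency 120%": 618, "load 80%": 413, "both": 1000}

# Percent savings relative to the average 1979 loss, rounded as in the table
percent = {name: round(100 * 1000 * s / AVERAGE_LOSS)
           for name, s in savings_k.items()}
```

The rounded percentages come out to 24, 16, and 39, in agreement with the last column of the table.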

VI.  CONCLUSION 

A mathematical model has been developed to predict Safety as a function of two
quantitative safety program indicators using the data obtained on 10 railroads for the
years 1976-1979. The model indicates that a savings of a million dollars is possible for
an average railroad by improving Efficiency by 20% and decreasing Equipment Load by
20%. Although a "good" fit for the 1976-1979 data was obtained, a word of caution is
in order about the use of this model for predicting Safety for future years. This is due
to the uncertainty of what values the explanatory variables, Efficiency, Equipment
Load, and Yearly Bias, will assume for a future year. There are other quantitative
variables that could possibly improve the model. However, we were not successful in
obtaining the data for the individual railroads. If further analysis is performed on data
obtained in the 1980's, it is suggested that additional quantitative variables be added to
the data base.

ACKNOWLEDGEMENTS 

The authors would like to acknowledge Edward O. Baicy, Linda L. Crawford,
Jock O. Grynovicki, Paul V. King and John F. Polk Jr. for their suggestions and
assistance in completing this analysis.




BIBLIOGRAPHY 


1. Belsley, D.A., Kuh, E. and Welsch, R.E., Regression Diagnostics, John Wiley &
Sons, Inc., New York, 1980.

2. Dixon, W.J. and Brown, M.B., editors, BMDP Biomedical Computer Programs
P-series, University of California Press, Los Angeles, 1979.

3. Draper, N.R. and Smith, H., Applied Regression Analysis, John Wiley & Sons, Inc.,
New York, 1981.

4. Graybill, F.A., Theory and Application of the Linear Model, Duxbury Press, North
Scituate, Massachusetts, 1976.

5. Morrison, D.F., Multivariate Statistical Methods, McGraw-Hill Book Company,
New York, 1976.

6. Srivastava, M.S. and Khatri, C.G., An Introduction to Multivariate Statistics, Elsevier
North Holland, Inc., New York, 1979.




APPENDIX  A.  DETAILED  DATA  ANALYSIS 


To establish guidelines for spending which could improve a railroad's safety
record, a mathematical model expressing Safety as a function of safety program
variables had to be established. In order to arrive at this model, extensive data analyses
were performed. The following is a sequential accounting of the analyses performed.


1.  Procedure 

1. Objectively define Safety as a function of one or more possible safety indicators.

2. Model Safety = f(X) where X is a vector of safety program indicators and f is a
function defined over X. This model is to be based on 1979 data.

3. Validate the established model for data from 1976-1978.

Each was to be completed in a separate phase of the analysis, contingent upon the
successful completion of the previous phases.


a. Phase 1 - Define Safety

To begin phase one of the analysis, the quality of the data was assessed.* Due to
insufficient data, one railroad was eliminated from the analysis. To eliminate possible
bias in the definition of Safety, it was first assumed that each of the safety indicators
was equally likely to be considered the most informative about a railroad's safety
performance. Assuming that among railroads there existed differences in safety
performance, the object here was to determine which subset of the several safety indicators
best illustrated those differences. Using a statistical technique, cluster analysis, the
railroads were objectively separated into two groups exhibiting differences in safety
performance. This separation is said to be objective because cluster analysis makes no
judgement about the two groups except that they are different. Not until values for
safety indicators within those groups were examined was one group classified as having
poor safety records and the other classified as having good safety records. Those
safety indicators which showed significantly different means between the good and
poor safety record groups were taken to be the most informative safety indicators.
The three indicators chosen were Injury Costs, Accident Frequency, and Accident
Severity.

Though three important safety indicators had been determined, Safety had not as
yet been defined. Using a statistical technique, discriminant analysis, information
from all three indicators could be condensed to one discriminant score. This score is
an indication of how firmly a railroad is implanted in its assigned group, in this case
how good or how poor the railroad is. Hence, this discriminant score was chosen to
represent Safety.
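The idea of condensing several indicators into one discriminant score can be sketched with Fisher's two-group linear discriminant. This is a simplified illustration under stated assumptions: it uses two indicators rather than the three used in the report (so the 2x2 covariance inverse can be written in closed form), the function names are hypothetical, and the data values are invented for the example.

```python
def mean(rows):
    """Column means of a list of (x, y) pairs."""
    return [sum(c) / len(rows) for c in zip(*rows)]

def scatter(rows, m):
    """Sums of squares and cross-products about the group mean m."""
    sxx = sum((r[0] - m[0]) ** 2 for r in rows)
    syy = sum((r[1] - m[1]) ** 2 for r in rows)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    return sxx, sxy, syy

def fisher_discriminant(good, poor):
    """Return score(x): negative on the 'good' side of the dividing line,
    positive on the 'poor' side, larger in magnitude farther from the line."""
    mg, mp = mean(good), mean(poor)
    gxx, gxy, gyy = scatter(good, mg)
    pxx, pxy, pyy = scatter(poor, mp)
    dof = len(good) + len(poor) - 2
    sxx, sxy, syy = (gxx + pxx) / dof, (gxy + pxy) / dof, (gyy + pyy) / dof
    det = sxx * syy - sxy ** 2
    dx, dy = mp[0] - mg[0], mp[1] - mg[1]
    # w = S^-1 (mean_poor - mean_good), using the closed-form 2x2 inverse
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    mid = ((mg[0] + mp[0]) / 2, (mg[1] + mp[1]) / 2)

    def score(x):
        return w[0] * (x[0] - mid[0]) + w[1] * (x[1] - mid[1])

    return score

# Hypothetical (Accident Frequency, Accident Severity) values per railroad
good = [(4.0, 32.0), (5.0, 30.0), (4.5, 35.0), (5.5, 31.0)]
poor = [(10.0, 90.0), (11.0, 86.0), (12.0, 92.0), (10.5, 94.0)]
score = fisher_discriminant(good, poor)
```

For these invented values both weights are positive, so the score rises with either indicator, which mirrors the report's description of a score that increases with each component variable.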

*See Appendix C.




The next step was to ensure that the variable Safety made sense intuitively in its
relation to the safety program indicators. At this point, a variable, Safety, had been
established, with which one could classify the 14 railroads in terms of their safety
records.

In a similar fashion, using safety program indicators, a variable Safety Program
was established, with which one could classify the 14 railroads as being varying degrees
of good and poor in terms of their safety programs. Intuitively, one would like those
railroads with good safety programs to have good safety records and those with poor
programs to have poor records. Table 1, in the body of this report, indicates that for 5
of the 14 railroads this was not the case. In the establishment of the variables Safety
and Safety Program, an artificial dichotomy was created between good and poor safety
records and safety programs. In reality, a gray area exists between the two groups,
though not sufficiently so that the creation of a third middle group is warranted.
Three of the five railroads yielding counterintuitive results were within this gray area.
Hence, only two of fourteen railroads yielded strictly counterintuitive results. This
gave us increased confidence in our objective formation of the variable Safety so that
we could then proceed to the modeling phase of the analysis.


b. Phase 2 - Model Safety = f(X)

It was desired to establish a functional relationship between the safety record of a
railroad and its safety program. The dependent variable Safety was formed from the
three most informative safety record indicators, Injury Costs, Accident Frequency and
Accident Severity. The variables Safety Cost and Safety Staff were the safety program
indicators deemed most informative in separating railroads on the basis of safety
program quality. These two were determined as such during the process in which Safety
Program was established. It was reasonable to assume that a function of Safety Cost
and Safety Staff would aid in explaining the variation in Safety among railroads. This
assumption was put to practice in a multiple linear regression model. It became
apparent, when performing the regression analysis, that Safety Cost and Safety Staff
were strongly correlated. To avoid multicollinearity, a condition which inflates
regression coefficient variances, Safety Cost was eliminated from the model. The stronger
variable, Safety Staff, was able to explain 52% of the variation in Safety among
railroads using the model:

Safety = -9.706 + 10.173 Ln(Safety Staff).

In examining the model it is useful to note that low values of Safety and Safety Staff
indicate good safety records and good safety programs, respectively.
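A model of this form, linear in the natural logarithm of a single predictor, can be fitted by simple least squares on the transformed variable. The sketch below is an illustration only: `fit_log_model` and `reported_model` are names chosen here, the Safety Staff values and the perturbations are hypothetical, and only the two coefficients inside `reported_model` come from the text.

```python
import math

def fit_log_model(staff, safety):
    """Least-squares fit of safety = a + b * ln(staff); returns (a, b, R^2)."""
    x = [math.log(s) for s in staff]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(safety) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, safety))
    b = sxy / sxx
    a = ybar - b * xbar
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, safety))
    ss_tot = sum((yi - ybar) ** 2 for yi in safety)
    return a, b, 1 - ss_res / ss_tot

def reported_model(staff):
    """Evaluate the 1979 model quoted in the text for a Safety Staff value."""
    return -9.706 + 10.173 * math.log(staff)

# Hypothetical Safety Staff values and made-up perturbations (not study data)
staff = [2.0, 2.5, 3.0, 4.0, 5.0]
noise = [0.5, -0.4, 0.2, -0.6, 0.3]
safety = [reported_model(s) + e for s, e in zip(staff, noise)]

a, b, r2 = fit_log_model(staff, safety)
```

Because the logarithm is increasing, a positive slope means predicted Safety rises (worsens) as Safety Staff rises, consistent with the interpretation that low values of both are favorable.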

For the purpose of improving the model and gaining use of the available
information, the remaining quantitative safety program indicators were included in the
analysis. The resulting regression model could be given as

Safety = 72.94 + [illegible] Ln(Safety Staff) - [illegible] Ln(Efficiency)
         + 7.50 Ln(Equipment Load).

This model improved on the previous simple linear regression model by explaining 68.5%
of the variation among railroads. The interpretations of Safety and Safety Staff are the
same as before. According to the model, to provide for a better Safety value a
railroad must be made more efficient and its equipment must be under less load.

Late in this phase, the qualitative results from the safety director survey were
made available for inclusion in the model. The analysis suggested that two of the
three quantitative variables in the previous model be replaced by one of the qualitative
variables. The new model could be given as

Safety = 4.53 + 5.95 Ln(Safety Staff) - [illegible](Hazard Control)

Safety and Safety Staff can be interpreted as before. Hazard Control is defined as the
hazard control technology employed by the railroad. Limited Hazard Control would
cause a railroad's safety performance to decline. This new model explains 62% of the
variation of Safety among railroads. Unfortunately, there are no similar data for other
years with which this last model can be validated.


c. Phase 3 - Model Validation

To validate the 1979 quantitative model over the years 1976-1978, a variable
substitution had to be made. Although attempts were made to gather needed data,
information necessary for the formation of Safety using discriminant analysis was not
available for 1976-1978. For this reason, a reasonable substitute dependent variable had to
be considered for the validation phase. One of the most informative safety performance
variables was not included in the discriminant analysis because it was a function
of Accident Frequency and Accident Severity, which were already incorporated in the
analysis. This variable, Accident Index, was then related to Safety and was used as
the dependent variable for the 1976-1978 data. Hence, the quantitative 1979 model
was reformulated in terms of Accident Index, leaving the three independent variables
as before. The amount of explained variation for this model was 64%, approximately
the same as when using Safety. It was also true that when using Accident Index for
the cluster analysis, the railroads were partitioned in the same manner. Furthermore,
when regressing Accident Index on the 1979 data, the same three independent
variables proved to be the most important to the model. For these reasons, it was thought
that the substitution was reasonable. The validation phase then consisted of trying to
predict Safety (Accident Index) for 1976-1979 using the new 1979 model with Accident
Index as the dependent variable.

Upon completion of the modeling phase of the analysis, data were supplied for the
years 1976-1978 with which to validate the model established with the 1979 data. If
the model performed well over these four years, confidence in it would increase.
Unfortunately, the model did an extremely poor job predicting Safety over those four
years. The model predicted best for 1978, where it was able to explain only 24% of the
variation of Safety among the 14 railroads.

Analysis then turned toward trying to establish whether or not the model had any
useful significance. Although the attempt to predict Safety was fruitless, it was
thought that the model may preserve the relative rankings of the railroads from good
to poor on the basis of Safety. The predicted values of Safety and the actual values of
Safety were used to establish predicted rankings and actual rankings of the railroads.
These rankings were studied for similarities, but none were found.
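One common way to quantify agreement between two rankings is Spearman's rank correlation. The report does not say how the rankings were compared, so the measure below is an assumption offered for illustration; the function names and the five observed and predicted values are hypothetical.

```python
def ranks(values):
    """Rank from 1 (smallest value) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(observed, predicted):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(observed)
    d2 = sum((a - b) ** 2
             for a, b in zip(ranks(observed), ranks(predicted)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical observed and predicted Safety values for five railroads
observed = [2.1, 3.4, 1.8, 2.9, 4.0]
predicted = [2.8, 2.5, 3.1, 2.2, 2.6]
rho = spearman(observed, predicted)
```

A value near +1 would indicate that the model preserves the ordering of the railroads; values near zero or below indicate that it does not.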

The next step taken was to see if any of the independent variables in the 1979
model seemed to have any bearing on Safety for each of those three years. To this
end, three separate multiple linear regressions were performed, one for each year,
where Safety for each year was taken as the dependent variable and Safety Staff,
Efficiency, and Equipment Load were taken as the independent variables. Obviously, if
they had a bearing, the explained variation of Safety among railroads for each of those
three years would be high. Once again, however, poor light was shed on the 1979
model because its independent variables did not seem to have any relationship with
Safety in the other three years. The regression coefficients for these three models
were examined for informative trends, but none were found.

Residual analysis was performed on these three models. It was observed that
four railroads consistently had high residuals (predicted Safety subtracted from
observed Safety). Furthermore, it was observed that for each of the four railroads,
the predicted value of Safety was consistently less or consistently more than the
observed value of Safety for those three models. Assuming that the models did in
fact have merit, these four railroads were classified as being different due to possible
reporting inconsistencies and were temporarily eliminated from the analysis.
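The screening described above, looking for railroads whose residuals are large and of the same sign in every year, can be sketched as follows. The function name, the threshold, and the residual values are all hypothetical; the report does not state what size of residual was considered high.

```python
def flag_consistent_outliers(residuals_by_year, threshold):
    """Return railroads whose residuals (observed minus predicted) have the
    same sign in every year and exceed `threshold` in magnitude every year."""
    flagged = []
    years = sorted(residuals_by_year)
    railroads = residuals_by_year[years[0]].keys()
    for rr in railroads:
        res = [residuals_by_year[y][rr] for y in years]
        same_sign = all(r > 0 for r in res) or all(r < 0 for r in res)
        if same_sign and all(abs(r) > threshold for r in res):
            flagged.append(rr)
    return flagged

# Hypothetical residuals for four railroads over three yearly models
residuals = {
    1976: {"A": 0.2, "B": 1.5, "C": -1.4, "D": -0.1},
    1977: {"A": -0.3, "B": 1.2, "C": -1.6, "D": 0.4},
    1978: {"A": 0.1, "B": 1.8, "C": -1.1, "D": 0.9},
}
flagged = flag_consistent_outliers(residuals, threshold=1.0)
```

In this invented example railroad B is always under-predicted and railroad C always over-predicted, so both are flagged, while A and D change sign or stay small.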

Based on the remaining ten railroads, a new model was formed for Safety as a
function of the three safety program indicators. This model was then used to predict
Safety for the other three years. The predictive powers of this model were much
better than those of its predecessor in 1977 and 1978 but remained poor for 1976. It was
interesting to note that the safety program indicator deemed most informative, Safety
Staff, did not significantly add any information to the model.

Four separate regression models were then formed, one for each of the years
1976-1979. The four regression models each explained 72% or more of the variation of
Safety among the ten railroads. This indicated that there was a relationship between
Efficiency, Equipment Load, and Safety, while Safety Staff again added no information
to the models. So that one model to explain Safety for each of the four years could be
developed, the data for those years were pooled for a multiple linear regression.


2.  Results 


The resulting model was able to explain 71% of the variation in Safety over the
four years. The predictive powers of the model are illustrated in Figures 1-4 in the
main body of the report. Keep in mind that small values for Safety are preferred.
The model is given below.

Safety = 3.342 - [illegible](Efficiency) + .033(Equipment Load) + 1.(Yearly Bias)

where Yearly Bias equals -.353, -.211, .058, .000 for 1976 through 1979, respectively.


APPENDIX  B.  STATISTICAL  TECHNIQUES 


1.  Cluster  Analysis 

Cluster analysis was used in the following way. It was assumed that railroads
differ in their safety performance. It was also assumed that each of the available
safety performance variables was equally likely to explain the safety performance
difference among railroads. Cluster analysis is a multivariate technique which allows
one to separate multivariate data into k populations based on Euclidean distance. The
many observations taken on each railroad's safety performance made up the multivariate
data on which the cluster analysis was performed. Railroads within each population
were similar in that they were positioned close together in Euclidean n-space. A
difference in values among populations for a safety performance variable was said to
exist if the hypothesis of equality of means among groups was rejected by an F-test at
the .05 significance level. Those safety performance variables showing a difference in
value among populations contributed more to the separation than did those safety
performance variables which showed no difference among populations. For this reason
those variables were considered the most informative in discriminating among railroads
in a safety performance sense. By examining primarily the values of those informative
variables among populations, one population of railroads was judged to have better
safety performance than another. In this manner railroads were classified as good or
poor and important safety performance indicators were identified.*
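
The two steps described above (grouping by Euclidean distance, then testing each variable for a difference in group means) can be sketched as follows. The report does not name the clustering algorithm used, so plain k-means is assumed here. The eight railroads and their three indicator values are hypothetical, and the critical value 5.99 is the tabled F(.05) for 1 and 6 degrees of freedom, which matches this toy data only.

```python
import math


def kmeans(points, k=2, max_iter=100):
    """Lloyd's k-means with Euclidean distance and deterministic seeding."""
    # Seed with the first point, then add the point farthest from all seeds.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def f_statistic(values, labels):
    """One-way ANOVA F statistic for one variable across the clusters."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {lab: sum(g) / len(g) for lab, g in groups.items()}
    ss_between = sum(len(g) * (means[lab] - grand) ** 2
                     for lab, g in groups.items())
    ss_within = sum((v - means[lab]) ** 2
                    for lab, g in groups.items() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical data: eight railroads, three safety performance variables.
railroads = [
    (1.1, 2.2, 5.5), (1.1, 1.8, 4.5), (0.9, 2.2, 4.5), (0.9, 1.8, 5.5),
    (3.1, 4.2, 6.0), (3.1, 3.8, 5.0), (2.9, 4.2, 5.0), (2.9, 3.8, 6.0),
]
F_CRIT = 5.99  # tabled F(.05; 1, 6); valid only for 2 groups and 8 railroads

labels = kmeans(railroads, k=2)
f_values = [f_statistic([r[i] for r in railroads], labels) for i in range(3)]
informative = [i for i, f in enumerate(f_values) if f > F_CRIT]
print(labels, informative)
```

Variables whose F statistic exceeds the critical value are the ones treated as informative. In practice the critical value depends on the number of groups and railroads and would come from an F table or a statistics library.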


2.  Discriminant  Analysis 

Discriminant analysis was applied in the following sense. In the case of safety
performance, two populations, good and poor, were determined using cluster analysis.
Furthermore, three safety program variables were determined to be important in that
population determination. It was reasonable then that some combination of these three
variables could be used to represent Safety, a univariate value describing the safety
performance of railroads. Discriminant analysis is a procedure with which one can
discriminate between populations of multivariate data. It does so in terms of a
univariate value formed by first establishing a mathematical dividing-line achieving
maximum separation between the two groups which were established using cluster
analysis. The univariate value is then, in a sense, the distance away from the
dividing-line for each railroad. With this distance, the discriminant score, the
railroads could be classified. Those railroads whose discriminant value placed them
below the aforementioned mathematical dividing-line were classified as good, and those
whose discriminant value placed them above the dividing-line were classified as poor.
A railroad's relative safety performance is thus indicated by its distance from the
dividing-line. More important than its classification properties, the discriminant
score served as a reasonable dependent variable for a regression model established
using information from several safety performance indicators.
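
The discriminant score described above can be illustrated with a minimal sketch of Fisher's two-group linear discriminant. This is an assumption about the form of the procedure, not the authors' exact computation. The data are hypothetical, and the sign convention (negative scores for good, positive for poor) is chosen to match the "below" and "above" wording in the text.

```python
def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_cov(g1, g2):
    """Pooled within-group covariance matrix of two groups."""
    p = len(g1[0])
    s = [[0.0] * p for _ in range(p)]
    for rows in (g1, g2):
        m = mean_vec(rows)
        for r in rows:
            d = [x - mu for x, mu in zip(r, m)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_discriminant(good, poor):
    """Return a scoring function; the dividing-line is where the score is 0."""
    mg, mp = mean_vec(good), mean_vec(poor)
    w = solve(pooled_cov(good, poor), [b - a for a, b in zip(mg, mp)])
    mid = [(a + b) / 2 for a, b in zip(mg, mp)]
    return lambda x: sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))


# Hypothetical groups, as might come out of a prior cluster analysis.
good = [(1.1, 2.2, 5.5), (1.1, 1.8, 4.5), (0.9, 2.2, 4.5), (0.9, 1.8, 5.5)]
poor = [(3.1, 4.2, 6.0), (3.1, 3.8, 5.0), (2.9, 4.2, 5.0), (2.9, 3.8, 6.0)]

score = fit_discriminant(good, poor)
good_scores = [score(r) for r in good]
poor_scores = [score(r) for r in poor]
print(good_scores, poor_scores)
```

The magnitude of each score plays the role of the distance from the dividing-line, and the scores themselves are what a follow-on regression would use as the dependent variable.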

*A similar procedure was used for safety program indicators.




APPENDIX  C.  DATA 


1.  Data  Sets 

Data used in these analyses came from the following sources. Paul King, a
consultant to BRL, compiled the first three data sets listed. Sets four and five were
compiled by the Association of American Railroads (A.A.R.). The sixth set was made up
of the results of the Safety Director Survey, which were determined by a committee
consisting of Edward O. Baicy (BRL), Paul King, and several railroad safety directors.

1.  Railway Operating Expense Rates - Class I Roads 1979

2.  Comparison - Efficiency / Profitability Indices - Class I Roads 1979

3.  Accident & Injury Statistical Data - Class I Roads 1979

4.  A.A.R., Class I Railroads Operating and Traffic Statistics 1976-1979

5.  A.A.R., Rankings of Class I Railroads 1976-1979

6.  Safety Director Survey 1979


2.  Safety  Performance  Indicators 

1.  equipment damage expense / gross operating revenue

2.  equipment  damage  expense  /  man-hours  employed 

3.  reported  accident  damage  /  gross  operating  revenue 

4.  reported  accident  damage  /  man-hours  employed 

5.  wreck  clearing  expenses  /  gross  operating  revenue 

6.  wreck  clearing  expenses  /  man-hours  employed 

7.  injury  costs  (injury  payoffs  /  gross  operating  revenue) 

8.  injury  payoffs  /  man-hours  employed 

9.  accident  and  injury  costs  /  gross  operating  revenue 

10.  accident  and  injury  costs  /  man-hours  employed 

11.  accident frequency (# of reported injuries / (man-hours employed x .000005))

12.  number  of  lost-work-day  injuries  /  number  of  reported  injuries 

13.  accident  severity  (lost-work-days  /  (man-hours  employed  x  .000005)) 

14.  accident  index  (SQRT( accident  frequency  x  accident  severity  /  200)) 
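
Indicators 11, 13, and 14 are defined by explicit formulas, which can be written out as a short sketch. The input figures below are hypothetical and chosen only to give round results. The function names are illustrative and do not come from the report.

```python
import math


def accident_frequency(reported_injuries, man_hours):
    # Indicator 11: reported injuries per (man-hours employed x .000005).
    return reported_injuries / (man_hours * 0.000005)


def accident_severity(lost_work_days, man_hours):
    # Indicator 13: lost work days per (man-hours employed x .000005).
    return lost_work_days / (man_hours * 0.000005)


def accident_index(frequency, severity):
    # Indicator 14: square root of (frequency x severity / 200).
    return math.sqrt(frequency * severity / 200)


# Hypothetical railroad: 2,000,000 man-hours, 50 injuries, 400 lost days.
man_hours = 2_000_000
freq = accident_frequency(50, man_hours)
sev = accident_severity(400, man_hours)
index = accident_index(freq, sev)
print(freq, sev, index)
```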


3.  Safety  Program  Indicators 

1.  safety staff ((man-hours employed x .000001) / # of safety representatives)

2.  efficiency  (.001x(revenue  ton  miles  /  man-hours  employed)) 

3.  safety  cost  (estimated  safety  program  costs  /  man-hours  employed) 

4.  freight  revenue  /  freight  expenses 

5.  equipment  load  (.001x(ton-miles  /  train-hour)) 

6.  train-miles  /  train-hour 

7.  percentage  of  freight-car-miles  loaded 

8.  total  investment  /  miles  of  track 

9.  track  miles  operated  /  man-hours  employed 

10.  number  of  locomotives  /  man-hours  employed 

11.  number  of  safety  representatives  /  number  of  employees 

12.  track  miles  operated  /  track  miles  operated  for  14  railroads 
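
The three scaled indicators in this list (safety staff, efficiency, and equipment load) can likewise be computed directly from their definitions. The sketch below uses hypothetical operating figures, and the function names are illustrative only.

```python
def safety_staff(man_hours, safety_representatives):
    # Indicator 1: (man-hours employed x .000001) / # of safety representatives.
    return (man_hours * 0.000001) / safety_representatives


def efficiency(revenue_ton_miles, man_hours):
    # Indicator 2: .001 x (revenue ton miles / man-hours employed).
    return 0.001 * (revenue_ton_miles / man_hours)


def equipment_load(ton_miles, train_hours):
    # Indicator 5: .001 x (ton-miles / train-hour).
    return 0.001 * (ton_miles / train_hours)


# Hypothetical railroad figures, chosen only for illustration.
staff = safety_staff(2_000_000, 4)
eff = efficiency(1_000_000_000, 2_000_000)
load = equipment_load(1_000_000_000, 50_000)
print(staff, eff, load)
```

Efficiency and equipment load are the two indicators that enter the Safety regression model given earlier in the report.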


4.  Subject  Categories  in  the  Safety  Director  Survey 

1.  section organization program value

2.  staffing  position  value 

3.  documentation 

4.  signature  authority 

5.  program  content 

6.  decision  authority 

7.  operating  relations 

8.  skill  resource 

9.  equipment  and  facility  resource 

10.  available  safety  equipment 

11.  monetary resource

12.  reviews, audits and inspections

13.  procedures 

14.  correct  action 

15.  accident  reporting  and  analysis 

16.  safety training

17.  safety  motivation  programs 

18.  hazard  control  technology 

19.  general  action  and  procedures 

20.  recommendations 

21.  past  actions 

22.  current  actions 


23.  safety  survey  score 


5.  Additional  Variables  of  Interest 

The following safety program indicators have been identified as possible
contributors to a better model. At the time of the writing of this report, this
information was unavailable for the 15 individual railroads.

1.  hours  of  maintenance  on  freight  cars  /  freight-car  hours 

2.  hours  of  maintenance  on  passenger  cars  /  passenger-car  hours 

3.  hours  of  maintenance  on  locomotives  /  locomotive  hours 

4.  safety  director’s  salary  /  top  executive’s  salary 

5.  implemented suggestions resulting from accident investigations / number of accidents

6.  recommended  suggestions  /  implemented  suggestions 

7.  number  of  thorough  maintenance  inspections  /  number  of  cars 

8.  number of thorough maintenance inspections (track) / 100 track miles

9.  safety director’s time spent on safety / safety director’s time

10.  safety personnel’s time spent on safety / safety personnel’s time

11.  investment in track and equipment / miles of track

12.  miles of track in city / miles of track

13.  track requiring reduced speed due to condition / miles of track


